(57) Abstract: A genetic selection system is described based on hybrid σ54 RNAP activator proteins. The invention permits the
detection and screening for protein:DNA and protein:protein binding events.

Selection System

The present invention relates to a screening system useful for screening repertoires of 
DNA binding domains. In particular the invention relates to a screening system based on 
transcriptional activators of bacterial σ54-dependent promoters.

The majority of proteins involved in cellular functions do so by interacting with other 
proteins or nucleic acid sequences within the cell. Several approaches have been described 
that allow the in vivo selection of nucleic acids which express polypeptides capable of 
binding to proteins or DNA in the cell. Arguably the most powerful approaches are the
yeast one- and two-hybrid systems (Fields S. & Song O. (1989) Nature 340, 245; see US
Patent 5,283,173, incorporated herein by reference in its entirety) for the screening of
protein-DNA and protein-protein interactions, respectively. However, the two-hybrid
system requires a eukaryotic host and consequently the diversity that can be screened is
limited. Furthermore, the system notoriously suffers from an abundance of false positives.

Larger molecular repertoires can be prepared in bacterial hosts and a number of bacterial 
systems for the screening of protein-protein and protein-DNA interactions have also been 
reported. Two systems have been put forward in which the polypeptide chain of an enzyme 
is expressed in two parts fused to two candidate polypeptides, and in which interaction 
between the candidate polypeptides reconstitutes the function of the enzyme (Karimova G. 
et al (1998) Proc. Nat. Acad. Sci USA 95, 5752; Pelletier J.N. et al (1998) Proc. Nat. 
Acad. Sci USA 95, 12141). 

Several in vivo screens for DNA-binding proteins have also been reported (reviewed in
Mossing M.C., Bowie J.U. & Sauer R.T. (1991) Methods Enzymol. 208, 604; Elledge S.J.
et al (1989) Proc. Nat. Acad. Sci USA 86, 3689). Each of these methods involves the
blockage of a hybrid σ70 promoter by the DNA binding protein. Repression of the
promoter either prevents the production of a conditionally toxic gene product or alleviates
repression of an antibiotic resistance gene by transcriptional interference. The
transcriptional interference assay (Elledge et al) has been used successfully in one case to
select DNA binding proteins with altered specificity (Sera T. & Schultz P.G. (1996) Proc.
Nat. Acad. Sci USA 93, 2920).

Another σ70-based system utilises recruitment of the polymerase to the promoter by way
of a protein-protein interaction between a protein domain fused to the RNA polymerase
α subunit and another fused to the lambda repressor bound immediately upstream of the
RNA polymerase promoter binding site (Dove S.L., Joung J.K., Hochschild A. (1997)
Nature 386, 627). By replacing the lambda repressor DNA binding domain with a library
of Zn-finger domains, specific DNA binding Zn-finger domains were selected (Joung J.K.,
Ramm E.I., Pabo C.O. (2000) Proc. Nat. Acad. Sci USA 97, 7382).

The alternative holoenzyme form of bacterial RNA polymerase (RNAP) contains the σ54
factor (σ54-RNAP). As has been previously shown, this polymerase, in most cases, forms
a closed complex with the promoter. Unlike σ70 promoters, at which the RNA polymerase
is bound in an active form and is largely controlled by repression, the σ54 RNA
polymerase holoenzyme is transcriptionally incompetent and is unable to initiate
transcription by itself. Initiation of transcription requires the presence of a transcriptional
activator that catalyses the isomerisation of the closed promoter complex to an open one.
Typically, activator proteins bind to a specific upstream activation sequence (UAS) located
80 to 200 bp upstream of the σ54 core promoter. The function of the UAS is to tether the
activator in the right position and to bring it in the vicinity of the promoter in order to
increase the efficiency of interaction between the σ54 RNAP and the activator.
Transcriptional activators of σ54-dependent promoters have been called bacterial
enhancers because their mechanism of activation is superficially similar to the activation of
transcription by enhancer proteins in eukaryotes (Kustu S. et al (1991) Trends Biochem. Sci.
16, 397).

Conversion of the σ54 RNAP into an active form is catalysed by the binding of an
enhancer protein coupled to hydrolysis of ATP. This unusual mechanism accounts for the
low level of background transcription and the enormous difference (10^4-10^5-fold) between
on and off states in the strongest σ54 promoters effected by a single factor. In comparison,
activators of σ70 promoters such as CAP or λcI increase transcription levels usually by
less than 10-fold.
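
As a rough illustration of the on/off contrast described above, the fold induction of a reporter is simply the ratio of its activated signal to its basal signal. The sketch below is a minimal example only: the function name `fold_induction`, the arbitrary units and all reporter readings are hypothetical assumptions chosen to fall within the ranges stated above, and are not data from this specification.

```python
def fold_induction(on_level: float, off_level: float) -> float:
    """Return the ratio of activated to basal reporter signal."""
    if off_level <= 0:
        raise ValueError("off_level must be positive")
    return on_level / off_level


# Hypothetical reporter readings in arbitrary units (illustrative only).
sigma54_fold = fold_induction(on_level=50_000.0, off_level=1.0)
sigma70_fold = fold_induction(on_level=800.0, off_level=100.0)

print(f"sigma54-type reporter: {sigma54_fold:.0f}-fold")
print(f"sigma70-type reporter: {sigma70_fold:.0f}-fold")
```

A very low basal level is what makes the ratio large: for a fixed activated signal, halving the background doubles the fold induction.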

Transcriptional activators of σ54 promoters (also known as enhancer-binding proteins or
EBPs) share a common structure (see Morett and Segovia (1993) J. Bacteriol. 6067-6074)
comprising a non-conserved N-terminal domain which has a putative regulatory
function, a central domain which is responsible for transcriptional activation, and a
C-terminal DNA binding domain which binds the relevant UAS in the target gene. The
domains are modular: the central and N-terminal domains together are capable of
constitutive activation of σ54 RNAP when overexpressed. At least in some cases, the
isolated DNA binding domain is capable of specifically binding its DNA recognition site.

In many instances, interaction between σ54 RNAP and the activator is enhanced by a
cellular factor which promotes DNA bending between the UAS and the σ54 promoter
(Freundlich et al. (1992) Mol. Microbiol. 6:2557-2563). This factor, known as integration
host factor (IHF), acts to promote transcription from σ54 promoters.



Summary of the Invention

We provide herein a novel screening system which is based on transcriptional activators of
σ54-based promoters.

According to a first aspect of the invention, therefore, there is provided a method for
detecting a protein-nucleic acid interaction between a nucleic acid molecule and a protein
molecule, comprising the steps of:

a) providing one or more hybrid σ54 activator proteins comprising a heterologous
nucleic acid binding sequence and a constitutively active σ54 transcription activating
domain;

b) providing one or more nucleic acid molecules comprising a binding site for the
nucleic acid binding sequence and a binding site for σ54 RNAP, which directs the
expression of a reporter gene and leads to upregulation thereof in response to activation by
the σ54 transcription activating domain; and

c) detecting expression of the reporter gene.

The invention provides a reporter system which is characterised by very low levels of
background expression, since the σ54 polymerase is transcriptionally incompetent in the
absence of a σ54 transcriptional activator. Since, at physiological concentrations, the
binding of the transcriptional activator to the nucleic acid is required in order to activate
transcription by σ54 RNAP, the system of the invention may be used as a tool for
investigating and/or screening protein/nucleic acid interactions exploiting the reporter gene
read-out.

In the first aspect of the invention, either the nucleic acid binding protein or the nucleic
acid molecule may be provided in the form of a repertoire of molecules. Repertoires of
hybrid σ54 activator proteins preferably are partially or completely randomised at least in
the heterologous nucleic acid binding sequence. This allows selection from the library of
molecules having desired nucleic acid binding characteristics.

Repertoires of nucleic acid molecules advantageously are partially or completely
randomised in the binding site for the nucleic acid binding sequence of the σ54 activator
protein. This allows selection of nucleic acid molecules having desired binding sites for
the chimeric activators.
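
The theoretical size of a completely randomised repertoire follows directly from the number of randomised positions: 4 possibilities per nucleotide position and 20 per amino acid position. The following is a minimal sketch of that arithmetic; the function names and the example lengths are illustrative assumptions and are not taken from this specification.

```python
def binding_site_diversity(randomised_bases: int) -> int:
    """Number of distinct DNA sequences when each position may be A, C, G or T."""
    if randomised_bases < 0:
        raise ValueError("randomised_bases must be non-negative")
    return 4 ** randomised_bases


def polypeptide_diversity(randomised_residues: int) -> int:
    """Number of distinct sequences when each position may be any of 20 amino acids."""
    if randomised_residues < 0:
        raise ValueError("randomised_residues must be non-negative")
    return 20 ** randomised_residues


# Arbitrary example lengths, chosen only to show the scale involved.
print(binding_site_diversity(10))   # fully randomised 10 bp binding site
print(polypeptide_diversity(6))     # six fully randomised residues
```

Partial randomisation reduces these numbers accordingly, since only the varied positions contribute a factor.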



In a second aspect of the invention, there is provided a system for selecting protein-protein
interactions based on the constitutively active hybrid σ54 activators described above. The
system according to the invention is conceptually similar to the yeast two-hybrid system.

Accordingly, there is provided a method for detecting a protein-protein interaction,
comprising the steps of:

a) providing a first hybrid protein comprising a nucleic acid binding sequence and a
first polypeptide sequence bait;

b) providing a second hybrid protein comprising a prey polypeptide sequence and a
constitutively active σ54 transcription activating domain;

c) providing a nucleic acid molecule comprising a binding site for the nucleic acid
binding sequence and a binding site for σ54 RNAP, which directs the expression of a
reporter gene and leads to upregulation thereof in response to activation by the σ54
transcription activating domain;

d) incubating the first and second hybrid proteins together with the nucleic acid
molecule such that the prey and bait polypeptide sequences may bind, thereby forming a
hybrid protein comprising both a nucleic acid binding sequence and a σ54 transcription
activating domain; and

e) detecting expression of the reporter gene.

As will be apparent to those skilled in the art, reference to a "binding site" for the nucleic
acid binding sequence includes the provision of several appropriately spaced binding sites
in the nucleic acid molecule.

As with the yeast two-hybrid system, in which a modular transcription factor is assembled
through binding of DNA binding domain/bait and transcription activating domain/prey
hybrids, the association of the nucleic acid binding sequence and the σ54 transcription
activating domain through the bait/prey interaction allows the detection of, and screening
for, protein-protein binding interactions in vivo and in vitro. Advantageously, the bait
and/or prey polypeptide sequences are provided in the form of repertoires, which may be
partially or completely randomised. This allows selection of prey polypeptides based on
their ability to form interactions with a desired bait (or vice versa). As the assay may be
conducted in vivo, in a bacterium, the invention permits the detection of in vivo binding
interactions between polypeptides in bacteria.
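
The selection logic of steps (a) to (e) can be summarised as a simple truth table: the reporter is expressed only when the bait hybrid occupies its binding site and the prey hybrid, which carries the activating domain, binds the bait. The sketch below is a deliberately simplified toy model. The function name, the prey names and the binding outcomes are all hypothetical, and the model ignores quantitative effects such as activator concentration.

```python
def reporter_expressed(bait_bound_to_site: bool, prey_binds_bait: bool) -> bool:
    """Toy model: reporter is on only if the DNA-bound bait recruits the prey hybrid."""
    return bait_bound_to_site and prey_binds_bait


# Hypothetical prey repertoire: name -> whether it binds the bait (illustrative only).
prey_library = {"prey_A": False, "prey_B": True, "prey_C": False}

# Keep only the preys whose interaction with the bait switches the reporter on.
selected = [
    name
    for name, binds in prey_library.items()
    if reporter_expressed(bait_bound_to_site=True, prey_binds_bait=binds)
]
print(selected)
```

In an actual screen the "selection" is performed by the host cells themselves through the reporter read-out; the list comprehension merely stands in for that step.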

It will be apparent that the hybrid proteins useful in the methods of the invention are
advantageously provided in the form of nucleic acid vectors or libraries thereof capable of
expressing said proteins in a host bacterium. Advantageously, the vector(s) include first
and second chimeric genes which encode the hybrid proteins of the invention. Preferably,
the vectors also include means for replication in bacteria. Also included may be one or
more marker genes, the expression of which in the bacterium permits selection of cells
containing the vector(s) from cells that do not contain the vector(s). Preferably, the
vector(s) are plasmid(s).

In a third aspect, the invention provides a method for screening a repertoire of
candidate DNA-bending polypeptides, comprising the steps of:

a) providing a repertoire of candidate polypeptide factors with potential to induce
bending of DNA;

b) providing a σ54 activator protein comprising a nucleic acid binding sequence
and a σ54 transcription activating domain;

c) providing a nucleic acid molecule comprising a binding site for the nucleic acid
binding sequence and a binding site for σ54 RNAP, which directs the expression of a
reporter gene and leads to upregulation thereof in response to activation by the σ54
transcription activating domain;

d) incubating the repertoire and σ54 activator together with the nucleic acid
molecule in an IHF⁻ host cell, such that the σ54 activator and the nucleic acid molecule
may interact, and transcription is activated from the σ54 RNAP binding site in a manner
dependent on DNA bending by the polypeptide factors; and

e) detecting expression of the reporter gene.

It is known that activation by σ54 activators may be regulated by factors which induce
DNA bending in the target gene. For example, the host factor IHF is known to potentiate
σ54 activation; moreover, it may be replaced by alternative DNA bending polypeptides, or
by intrinsically bent DNA.

The invention moreover provides methods for development of improved σ54
activator-based tools.

The first chimeric gene includes a nucleic acid sequence that encodes a nucleic
acid-binding domain and a first (bait) test protein or protein fragment in such a manner
that the first test protein is expressed as part of a hybrid protein with the nucleic
acid-binding domain.

The second chimeric gene also includes a promoter and a transcription termination signal
to direct transcription. The second chimeric gene moreover includes a nucleic acid
sequence that encodes a σ54 transcriptional activation domain and a second (prey) test
protein or protein fragment, in such a manner that the second test protein is
capable of being expressed as part of a hybrid protein with the transcriptional activation
domain.

The invention moreover provides kits for practising the invention, which kits
advantageously comprise a container, two vectors, and a host cell. The first vector contains
a promoter and may include a transcription termination signal functionally associated with
the first chimeric gene in order to direct the transcription of the first chimeric gene. The
chimeric gene advantageously comprises one or more unique restriction site(s) to insert a
nucleic acid sequence encoding a test bait polypeptide. The kit may also include a
second vector which contains a second chimeric gene, optionally comprising one or more
unique restriction site(s) to insert a nucleic acid sequence encoding the prey polypeptide;
alternatively, the second chimeric gene may be present on the same vector as the first
chimeric gene.

Brief description of the Figures

Figure 1A is a schematic representation of σ54 RNAP activation by σ54 activator NifA.

Figure 1B is a schematic representation of the invention, in which the σ54 activator DNA
binding domain is replaced with a heterologous GCN4 DNA binding domain.

Figure 2 is a schematic representation of the first aspect of the present invention, in which
a library of DNA binding domains is screened together with a library of DNA binding
domain binding sites to identify protein:DNA binding pairs.

Figure 3 shows the activation of transcription by NifA-chimera as expressed as percent of
wt activity (NifA/UAS). Nif-GCN4 (in presence of the NifAΔC coactivator (NifADC))
shows close to wt activity. Equal activity is observed for the two distinct GCN4 DNA
recognition sites (ATF/Creb and AP-1). Less than 1% wt activity is observed with a
non-cognate reporter such as one bearing the wt nifH UAS.

Figure 4 shows the activation of transcription by NifA-chimera as expressed as percent of
wt activity (NifA/UAS). Nif-ERDBD (in presence of the NifAΔC coactivator (NifADC))
shows ca. 80% of wt activity. Very little activation is observed with a non-cognate reporter
bearing the DNA recognition site (GRE) for the closely related Glucocorticoid receptor.

Figure 5 shows the coactivation by different NifA variants as expressed as percent of wt
activity (NifAwt). NifA from Klebsiella pneumoniae (NifAKp) is superior to all others,
even exceeding wt activity (up to 160%). NifAKp with its DNA binding domain deleted
(NifAΔCKp (NifADCKp)) is almost as active.

Detailed Description of the Invention 

Unless defined otherwise, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art (e.g., in bacterial cell
culture, molecular genetics, nucleic acid chemistry, protein chemistry and biochemistry).
Standard techniques are used for molecular, genetic and biochemical methods (see
generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed. (1989) Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. and Ausubel et al., Short
Protocols in Molecular Biology (1999) 4th Ed., John Wiley & Sons, Inc., which are
incorporated herein by reference), chemical methods, pharmaceutical formulations and
delivery and treatment of patients.

A: Nucleic Acids and Proteins 

As used herein, "nucleic acid" refers to any natural nucleic acid, including RNA and DNA
as well as synthetic nucleic acid comprising modified or synthetic bases, and mixtures of
modified or synthetic bases with natural bases. Such modified and/or synthetic bases may
be referred to as derivatives of DNA or RNA. Preferably, "nucleic acid" refers to DNA.

The invention includes the use of modified and/or artificial "nucleic acids". A number of
modifications have been described that alter the chemistry of the phosphodiester
backbone, sugars or heterocyclic base components of nucleic acids.

Among useful changes in the backbone chemistry are phosphorothioates;
phosphorodithioates, where both of the non-bridging oxygens are substituted with
sulphur; phosphoroamidites; alkyl phosphotriesters and boranophosphates. Achiral
phosphate derivatives include 3'-O-5'-S-phosphorothioate, 3'-S-5'-O-phosphorothioate,
3'-CH2-5'-O-phosphonate and 3'-NH-5'-O-phosphoroamidate. Peptide nucleic acids
replace the entire phosphodiester backbone with a peptide linkage.

Sugar modifications are also known. The α-anomer of deoxyribose may be used, where
the base is inverted with respect to the natural β-anomer. The 2'-OH of the ribose sugar
may be altered to form 2'-O-methyl or 2'-O-allyl sugars, which provides resistance to
degradation without compromising affinity.

Modification of the heterocyclic bases must maintain proper base pairing. Some useful
substitutions include deoxyuridine for deoxythymidine; 5-methyl-2'-deoxycytidine and
5-bromo-2'-deoxycytidine for deoxycytidine. 5-propynyl-2'-deoxyuridine and
5-propynyl-2'-deoxycytidine have been shown to maintain biological activity when
substituted for deoxythymidine and deoxycytidine, respectively.

As used herein, the term "protein" includes single-chain polypeptide molecules as well as
multiple-polypeptide complexes where individual constituent polypeptides are linked by
covalent or non-covalent means. As used herein, the terms "polypeptide" and "peptide"
refer to a polymer in which the monomers are amino acids and are joined together through
peptide or disulphide bonds. The term "domain" also refers to polypeptides and peptides
having biological function. A peptide useful in the invention will have a binding or
transcription activating capability, i.e., with respect to binding to nucleic acids, other
proteins or polypeptides, and activation of σ54 RNAP transcription. It also may have
another biological function that is a biological function of a protein or domain from which
the peptide sequence is derived.

A hybrid protein is a protein or polypeptide which comprises constituent parts derived
from at least two naturally-occurring or artificial proteins. In particular, it may comprise
the DNA-binding domain of one protein and the protein-binding or transcription activating
domain of a second protein.

B: σ54 Activators

Activators of σ54 transcription are well known and have been reviewed, for example, by
Buck et al., J Bacteriol. 2000 Aug;182(15):4129-36; Studholme and Buck, FEMS
Microbiol Lett. 2000 May 1;186(1):1-9; Shingler, Mol Microbiol. 1996 Feb;19(3):409-16;
Goosen and van der Putte, Mol Microbiol. 1995 Apr;16(1):1-7; Merrick, Mol Microbiol.
1993 Dec;10(5):903-9; and others. A family of such activator proteins has been defined,
and its members found to share homology in the central (catalytic) domain which is
responsible for σ54 RNAP activation.

Members of the family include the following (the numbers are GenBank accession 
numbers) 

dbj|BAA16379.1| (D90877) FORMATE HYDROGENLYASE TRANSCRIPTIONAL ACTIVATOR
emb|CAA26472.1| (X02616) pot. NifA gene product (aa 1-484) [Klebsiella pneumoniae]
emb|CAA53584.1| (X75972) anfA [Rhodobacter capsulatus]
emb|CAA92413.1| (Z68203) NifA homologue [Rhizobium sp.]
emb|CAA93242.1| (Z69251) MopR [Acinetobacter calcoaceticus]
emb|CAB53157.1| (X07567) NifA1 [Rhodobacter capsulatus]
emb|CAB56537.1| (AJ249642) response regulator [Pseudomonas stutzeri]
gb|AAA58220.1| (U18997) ORF_o532 [Escherichia coli]
gb|AAA99303.1| (L43064) regulatory protein [Pseudomonas aeruginosa]
gb|AAB91397.1| (AF033203) NifAn protein [Rhodobacter capsulatus]
gb|AAC05586.1| (AF006075) regulatory protein [Bacillus subtilis]
gb|AAC37124.1| (L81176) FleQ [Pseudomonas aeruginosa]
gb|AAC45640.1| (AF010585) putative sigma 54 activator [Caulobacter crescentus]
gb|AAC46367.1| (AF014113) two-component response regulator [Vibrio cholerae]
gb|AAD34591.1|AF145956_1 (AF145956) transcriptional activator NifA [Rhodospirillum rubrum]
gb|AAD38416.1| (AF155934) NifA [Alcaligenes faecalis]
gb|AAF28395.1| (AF069392) FlaM [Vibrio parahaemolyticus]
gb|AAF33506.1| (AF170176) Salmonella typhimurium transcriptional regulatory protein
gb|AAF61932.1| (AF230804) sigma-54 activator protein Act! [Myxococcus xanthus]
gb|AAF85342.1|AE004061_7 (AE004061) two-component system, regulatory protein [Xylella fastidiosa]
gb|AAF94676.1| (AE004230) sigma-54 dependent response regulator [Vibrio cholerae]
gb|AAF95280.1| (AE004286) sigma-54 dependent response regulator [Vibrio cholerae]
gb|AAF96095.1| (AE004358) sigma-54 dependent transcriptional regulator [Vibrio cholerae]
gb|AAG01527.1|AF288483_1 (AF288483) NifA [Azospirillum brasilense]
pir||A48291 ornithine decarboxylase inhibitor - Escherichia coli
pir||B49940 nitrogen regulator I homolog - Escherichia coli
pir||C70320 transcription regulator NifA family - Aquifex aeolicus
pir||C70396 transcription regulator NtrC family - Aquifex aeolicus
pir||C70454 transcription regulator NtrC family - Aquifex aeolicus
pir||D70315 transcription regulator NtrC family - Aquifex aeolicus
pir||H69581 transcription activator of acetoin dehydrogenase operon acoR - Bacillus subtilis
pir||I39719 nitrogen regulatory protein - Agrobacterium tumefaciens
pir||JC5471 regulatory protein NifA - Azospirillum lipoferum
pir||T08624 probable NtrC-type response regulator - Eubacterium acidaminophilum
sp|P03027|NIFA_KLEPN NIF-SPECIFIC REGULATORY PROTEIN
sp|P09570|NIFA_AZOVI NIF-SPECIFIC REGULATORY PROTEIN
sp|P12627|VNFA_AZOVI NITROGEN FIXATION PROTEIN VNFA
sp|P14375|HYDG_ECOLI TRANSCRIPTIONAL REGULATORY PROTEIN HYDG
sp|P21712|YFHA_ECOLI HYPOTHETICAL 49.1 KD PROTEIN IN GLNB-PURL INTERGENIC REGION (ORFXB)
sp|P24426|NIFA_RHILT NIF-SPECIFIC REGULATORY PROTEIN
sp|P25852|HYDG_SALTY TRANSCRIPTIONAL REGULATORY PROTEIN HYDG
sp|P27713|NIFA_HERSE NIF-SPECIFIC REGULATORY PROTEIN
sp|P30667|NIFA_AZOBR NIF-SPECIFIC REGULATORY PROTEIN
sp|P38035|RTCR_ECOLI TRANSCRIPTIONAL REGULATORY PROTEIN RTCR
sp|P54929|NIFA_AZOLI NIF-SPECIFIC REGULATORY PROTEIN
sp|P56266|NIFA_KLEOX NIF-SPECIFIC REGULATORY PROTEIN
sp|Q06065|ATOC_ECOLI ACETOACETATE METABOLISM REGULATORY PROTEIN ATOC (ORNITHINE/ ARGININE
sp|Q46802|YGEV_ECOLI HYPOTHETICAL SIGMA-54-DEPENDENT TRANSCRIPTIONAL REGULATOR IN
sp|Q53206|NIFA_RHISN NIF-SPECIFIC REGULATORY PROTEIN
sp|Q9ZIB7|TYRR_ERWHE TRANSCRIPTIONAL REGULATORY PROTEIN TYRR

Moreover, a number of polypeptides belonging to the σ54 activator family have been
described whose 3D structures are known. These include: 113161 acetoin catabolism
regulatory protein; 113629 alginate biosynthesis transcriptional regulatory protein ALGB;
266789 type 4 fimbriae expression regulatory protein PILR; 113833 nitrogen fixation
protein ANFA; 138884 nitrogen fixation protein VNFA; 128219 nif-specific regulatory
protein; 3024194 nif-specific regulatory protein; 1168553 acetoacetate metabolism
regulatory protein ATOC (ornithine/arginine decarboxylase inhibitor) (ornithine
decarboxylase antizyme); 417166 transcriptional regulatory protein HYDG; 266622
nif-specific regulatory protein; 1352500 nif-specific regulatory protein; 128224 nif-specific
regulatory protein; 128225 nif-specific regulatory protein; 128221 nif-specific regulatory
protein; 128226 nif-specific regulatory protein; 1346014 transcriptional regulatory protein
FLBD; 549560 hypothetical sigma-54-dependent transcriptional regulator in GUTQ-HYPF
intergenic region; 139857 transcriptional regulatory protein XYLR (67 kd protein); 120053
formate hydrogenlyase transcriptional activator; 2507375 hypothetical 49.1 kd protein in
GLNB-PURL intergenic region (ORFXB) (orf-2); 134961 signal-transduction and
transcriptional-control protein; 1171795 nitrogen assimilation regulatory protein; 417388
nitrogen regulation protein nr(i); 123466 hydrogenase transcriptional regulatory protein
HOXA; 399925 hydrogenase transcriptional regulatory protein HOXA; 585586 nitrogen
assimilation regulatory protein NTRX; 118399 c4-dicarboxylate transport transcriptional
regulatory protein DCTD; 585267 pathogenicity locus probable regulatory protein HRPR;
1346313 pathogenicity locus probable regulatory protein HRPS; 549447 pathogenicity
locus probable regulatory protein WTSA; 585909 arginine utilization regulatory protein
ROCR; 136600 transcriptional regulatory protein TYRR; 1174836 transcriptional
regulatory protein TYRR homolog; 123748 hydrogenase transcriptional regulatory protein
HUPR1; 128604 nitrogen regulation protein NTRC; 1169293 glycerol metabolism operon
regulatory protein; and 129957 phosphoglycerate transport system transcriptional
regulatory protein PGTA. The numbers are GenBank gi numbers.



Preferably, the hybrid σ54 activator is based on the NifA activator. The Nif family of
bacterial enhancers regulate expression of nitrogenase components from σ54 promoters in
nitrogen-fixing bacteria, and are inhibited by NifL (Austin S. et al (1994) J. Bacteriol. 176,
3460). In bacteria lacking NifL, NifA is constitutively active. NifA is modular in
architecture and it is shown herein that this allows for the swapping of the natural
DNA-binding domain (DBD) for heterologous DBDs. Such NifA-DBD chimaeras are inactive
on the wild type promoter, but activate transcription from hybrid promoters bearing their
cognate target sequences.

Advantageously, the hybrid σ54 activator may be based on E. coli PspF (see Jovanovic et
al., (1996) J. Bacteriol. 178:1936-1945). PspF lacks the N-terminal regulatory domain
typical of σ54 activators, and is constitutively active but negatively regulated by PspA.
Thus, in bacteria lacking PspA, PspF is constitutively active.

Other σ54 activators may be rendered constitutively active by removal of the N-terminal
regulatory domain or by appropriate mutation.

Nucleic acid binding sequences or domains are known in the art and may be derived from
σ54 activator proteins or any other DNA binding proteins, whether naturally-occurring or
synthetic. Moreover, DNA-binding domains may be synthesised by partial or complete
randomisation. Many naturally-occurring DNA-binding proteins contain independently
folded domains for the recognition of DNA, and these domains in turn belong to a large
number of structural families, such as the leucine zipper, the homeodomain, the
"helix-turn-helix", the zinc finger and various other transcription factor families.

WO 01/18244 PCT/GB00/03450

C: Libraries

The term library refers to a mixture of heterogeneous polypeptides or nucleic acids. The
library is composed of members, each of which has a unique polypeptide or nucleic acid
sequence. To this extent, library is synonymous with repertoire, although in general the
term "library" is used herein to denote the source of the repertoire - e.g. a library of
nucleic acid molecules which encodes a repertoire of polypeptides. Sequence differences
between library members are responsible for the diversity present in the library. The
library may take the form of a simple mixture of polypeptides or nucleic acids, or may be
in the form of organisms or cells, for example bacteria, viruses, animal or plant cells and
the like, transformed with a library of nucleic acids. Advantageously, the nucleic acids are
incorporated into expression vectors, in order to allow expression of the polypeptides
encoded by the nucleic acids. In a preferred aspect, therefore, a library may take the form
of a population of host organisms, each organism containing one or more copies of an
expression vector containing a single member of the library in nucleic acid form which
can be expressed to produce its corresponding polypeptide member. Thus, the population
of host organisms has the potential to encode a large repertoire of genetically diverse
polypeptide variants.
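
When a library is held as a population of transformed host cells, a practical question is how many independent clones are needed for a given member to be present with a chosen probability. The sketch below uses the standard sampling relationship N = ln(1 - P) / ln(1 - 1/D), where D is the number of distinct library members and P the desired probability. This relationship is not stated in this specification; it is introduced here as an assumption, and it further assumes that all members are equally represented. The function name and the example values are hypothetical.

```python
import math


def clones_needed(library_size: int, probability: float) -> int:
    """Independent clones needed so that a given member is present with the given probability.

    Assumes every library member is equally likely to be picked up by each clone.
    """
    if library_size < 2:
        raise ValueError("library_size must be at least 2")
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    # log1p(-x) computes ln(1 - x) accurately when x is very small.
    return math.ceil(math.log(1.0 - probability) / math.log1p(-1.0 / library_size))


# Illustrative values only: a library of one million members at 95% confidence.
n = clones_needed(library_size=10**6, probability=0.95)
print(n)
```

Under these assumptions the required number of clones is roughly three times the library size for 95% confidence, which is why transformation efficiency limits the diversity that can usefully be screened.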

Libraries of hybrid proteins may be prepared and selected together with libraries of hybrid
nucleic acids. "Crossing" of hybrid libraries is performed by combinatorial infection,
which has been employed successfully to generate very large antibody libraries (Griffiths
et al (1994) EMBO J. 13, 3245).

Although libraries for use in the present invention may be phage libraries, as is known in
the art, it is possible to use alternative libraries which are constructed using other vectors,
such as plasmids. In any case, the present invention does not require the library to be
capable of "display" of the gene product at the bacterial surface, as with phage libraries;
rather, the gene product is preferably expressed intracellularly, and is advantageously not
expressed as a fusion with a vector gene product.

DNA binding domain libraries are preferably based on a known DNA binding domain 
5 architectures (e.g. basic leucine zipper, bZIP) and may be derived using PCR 
amplification with "family-specific" primers. Such libraries may be crossed with hybrid- 
promoters bearing defined target sequences or libraries of target sequences. In addition to 
providing information on the distribution of members of the family in a given genome, 
such libraries may be used to identify and study proteins or molecular compounds that 
10 modify DNA interaction within a family of DNA binding domains, for example Tax 
(from HTLV- 1 ) in the case of bZIP proteins. 

In an alternative embodiment, they may also be used to select DNA binding domains
which conditionally bind their target sequence only in the presence of other factors such
as proteins, cofactors or small molecular compounds, for example drugs that intercalate
into DNA or alter the degree of supercoiling or recognise DNA sequences which have
been modified chemically (e.g. methylated). The system can also be used "in reverse", i.e.
to select proteins or molecular compounds that disrupt a particular DNA-protein
interaction or to select DNA binding domains that do not bind a particular target sequence
or library thereof.

More advanced libraries are preferably derived directly from genomic DNA or cDNA
libraries and selected on hybrid promoters bearing a repertoire of target sequences,
comprising either a stretch of randomised sequence or a library of inserts derived from
fragmented genomic DNA. Data obtained in this way allow the compilation of a genomic
directory of DNA binding domains and the building of a promoter-DNA binding domain
interaction map.

D: Hybrid polypeptides 

The generation of hybrid polypeptides by domain fusion is well known in the art and may
be effected by fusing polypeptides or, preferably, by fusing nucleic acids which encode
the polypeptides. It has been known since 1976 that DNA binding and transcriptional
activator domains are separable, and can be swapped between proteins; see Ma and
Ptashne, who reported (Cell (1987) 51, 113-119; Cell (1988) 55, 443-446) that when both
the GAL4 N-terminal domain and C-terminal domain are fused together in the same
protein, transcriptional activity is induced. Other proteins are also known to function as
transcriptional activators via the same mechanism. For example, the GCN4 protein of
Saccharomyces cerevisiae, as reported by Hope and Struhl, Cell, 46, 885-894 (1986), the
ADR1 protein of Saccharomyces cerevisiae, as reported by Thukral et al., Molecular and
Cellular Biology, 9, 2360-2369 (1989), and the human estrogen receptor, as discussed by
Kumar et al., Cell, 51, 941-951 (1987), all contain separable domains for DNA binding
and for maximal transcriptional activation.

The same is specifically known of the σ54 bacterial transcriptional activators, although a
genetic screen based thereon has not been proposed. Therefore, the present invention may
be carried out using techniques which are known to those skilled in the art, particularly as
applied to two-hybrid techniques in eukaryotic cells.

Synthesis of chimeric genes for the purposes of the present invention may be carried out
by any desired means, including polynucleotide synthesis and mutagenesis approaches.
For example, a number of methods for site-directed mutagenesis are known in the art,
from methods employing single-stranded phage such as M13 to PCR-based techniques
(see "PCR Protocols: A guide to methods and applications", M.A. Innis, D.H. Gelfand,
J.J. Sninsky, T.J. White (eds.), Academic Press, New York, 1990). Preferably, the
commercially available Altered Site II Mutagenesis System (Promega) may be employed,
according to the directions given by the manufacturer.

E: Host Cells 

Host cells useful in conjunction with the present invention are prokaryotic cells,
advantageously bacterial cells. E. coli is the preferred host; however, host cells may
belong to any species or genus in which σ54 RNAP-driven transcription is possible, such
as Klebsiella, Rhodobacter, Rhizobium, Acinetobacter, Pseudomonas, Escherichia,
Bacillus, Caulobacter, Vibrio, Rhodospirillum, Alcaligenes, Salmonella, Myxococcus,
Xylella, Azospirillum, Aquifex, Agrobacterium and other organisms. In E. coli, the
preferred configuration is a modified strain, in which a truncated form of NifA (or another
activator) is coexpressed to boost specific activation (see Methods).

Preferably, the host cells lack repressors of the σ54 activator being used, such that the
transcription activating domain is constitutively active. Repressors may be deleted by
genetic mutation and/or selection, or inhibited by expression of antisense constructs, or
the like. In general, due to the accessibility of bacterial genetics, especially in E. coli,
deletion of repressor genes is straightforward to those skilled in the art.

F: Reporter Genes 

Reporter genes of various types are known in the art and may be used in conjunction with
the present invention. A "reporter gene", as referred to herein, may be the coding
sequence which encodes a detectable gene product, or the coding sequence including the
necessary control sequences for its expression in accordance with the invention, as
appropriate.

Advantageously, the reporter gene is selected from the group consisting of metabolic
markers such as the lac operon (lacZ, lacY and lacA); proteins conferring a fluorescent
phenotype, such as GFP; proteins conferring antibiotic resistance, such as Zeo; and
proteins conferring another selectable property.

Certain reporters, such as the lacZ gene, are widely used in bacterial genetics and are
useful in the performance of the invention. However, other genes may also be employed,
including fluorescent proteins. For example, green fluorescent proteins (GFPs) of
cnidarians, which act as their energy-transfer acceptors in bioluminescence, can be used in
the invention. A green fluorescent protein, as used herein, is a protein that fluoresces
green light, and a blue fluorescent protein is a protein that fluoresces blue light. GFPs
have been isolated from the Pacific Northwest jellyfish, Aequorea victoria, from the sea
pansy, Renilla reniformis, and from Phialidium gregarium (Ward et al., 1982,
Photochem. Photobiol. 35: 803-808; Levine et al., 1982, Comp. Biochem. Physiol. 72B:
77-85).

A variety of Aequorea-related GFPs having useful excitation and emission spectra have
been engineered by modifying the amino acid sequence of a naturally-occurring GFP from
Aequorea victoria (Prasher et al., 1992, Gene, 111: 229-233; Heim et al., 1994, Proc.
Natl. Acad. Sci. U.S.A., 91: 12501-12504; PCT/US95/14692). As used herein, a
fluorescent protein is an Aequorea-related fluorescent protein if any contiguous sequence
of 150 amino acids of the fluorescent protein has at least 85% sequence identity with an
amino acid sequence, either contiguous or non-contiguous, from the wild-type Aequorea
green fluorescent protein (SwissProt Accession No. P42212). Similarly, the fluorescent
protein may be related to Renilla or Phialidium wild-type fluorescent proteins using the
same standards.

Aequorea-related fluorescent proteins include, for example, wild-type (native) Aequorea
victoria GFP, whose nucleotide and deduced amino acid sequences are presented in
GenBank Accession Nos. L29345, M62654, M62653, and other Aequorea-related
engineered versions of Green Fluorescent Protein, of which some are listed above. Several
of these, i.e. P4, P4-3, W7 and W2, fluoresce at a distinctly shorter wavelength than wild
type.

A specific advantage of fluorescent proteins is that they facilitate FACS sorting of cells in
a manner dependent on reporter gene expression (Norman, S.O. (1980). Flow cytometry.
Med Phys. 7, 609-615; Mackenzie, KM. & Pinder, A.C. (1986). The application of flow
microfluorimetry to biomedical research and diagnosis: a review. Dev. Biol. Stand. 64,
181-193).

Other reporter genes may complement auxotrophic mutations, or confer antibiotic
resistance or other selectable characteristics on the host bacteria. Reporter genes may be
wholly or partly heterologous to the host cell, and introduced by mutagenesis and/or
transformation with appropriate vectors. Alternatively, endogenous σ54-responsive genes
may be used as reporter genes.

The reporter gene also contains a binding site for σ54 RNAP. The consensus sequence
for σ54 RNAP binding is 5' TGGCAC-N5-TTGCa/t 3'. This sequence is located at -12 to
-24 with respect to the start of transcription, whilst the more common sigma 70
recognition sequence is situated at -10 to -35. Both the GG and the GC must be on the same
face of the DNA helix.
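
By way of illustration only, the consensus quoted above can be written as a pattern and used to scan a candidate reporter sequence. The following is a minimal sketch in Python; the function name find_sigma54_sites and the test sequence are hypothetical and are not taken from the specification. Natural σ54 promoters often deviate from the consensus, so an exact-match scan of this kind will miss such sites.

```python
import re

# Consensus from the text: 5' TGGCAC-N5-TTGC(A/T) 3'.
# The lookahead lets overlapping matches be reported as well.
SIGMA54_CONSENSUS = re.compile(r"(?=(TGGCAC[ACGT]{5}TTGC[AT]))")


def find_sigma54_sites(seq):
    """Return (start_index, matched_site) for each exact consensus match.

    Only the strand given is scanned; indices are 0-based.
    """
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in SIGMA54_CONSENSUS.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical test sequence built around one consensus site.
    example = "AAA" + "TGGCAC" + "GACTT" + "TTGCA" + "GG"
    print(find_sigma54_sites(example))
```

A relaxed scan (for example, allowing one mismatch outside the conserved GG and GC dinucleotides) may be more useful in practice; that choice is left open here.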

In order to increase specificity, combinations of two or more reporter genes (preferably in 
tandem) may be used. 

Where the reporter gene is chimeric, i.e. comprises heterologous binding sites for the
nucleic acid binding sequence and σ54 RNAP binding sites incorporated into the same
nucleic acid, the spacing between the σ54 RNAP binding site and the nucleic acid binding
sequence binding site is preferably conserved with respect to the natural gene from which
the σ54 RNAP binding site is taken. Advantageously, the spacing is at least calculated
such that the spatial relationship of the elements on respective faces of the nucleic acid
helix is maintained.

Reporter genes advantageously comprise a binding site for a further activation factor, such
as IHF. These factors are believed to induce bending of the DNA, thus potentiating
activation of σ54 RNAP-driven transcription by σ54 activators. Alternatively, the DNA
itself may be intrinsically bent, thus providing constitutive potentiation of σ54-specific
activation.

G: Configurations of the Invention

The present invention may be configured in three basic ways: a first configuration, in
which reporter gene activation is dependent on the interaction between the nucleic acid
and a nucleic acid binding domain on the hybrid protein; a second configuration, in which
reporter gene activation is dependent on interaction between bait and prey polypeptides
which serves to bring together two or more components of the hybrid protein; and a third
configuration, in which reporter gene activation by a σ54 activator is dependent on the
presence of DNA-bending polypeptides. As referred to herein, an interaction is
advantageously a binding interaction.

Where the invention is configured to detect protein-nucleic acid interaction, libraries of
proteins and/or nucleic acids may be prepared as described above. Proteins having
improved nucleic acid binding, or nucleic acid sequences having improved affinity for
protein domains, may be developed by mutagenesis and selection of candidate sequences.
Alternatively, protein and/or nucleic acid sequences may be used to identify, in vivo,
cognate binding partners.

Chimeric σ54 activators also offer the opportunity to better understand aspects of the
process of transcriptional activation at σ54 promoters. In the case of NifA, it is known that
binding of the target sequence together with ATP binding promotes oligomerisation of
NifA. It is believed that it is the oligomer which contacts the polymerase and catalyses the
ATP-driven isomerisation of the polymerase holoenzyme. Taking advantage of the
superactivation effect described above, it may be possible to address questions such as
which components of the oligomer (e.g. the DNA-bound NifA vs. NifAΔC, i.e. NifA with
the DNA binding domain removed) contact the polymerase and/or couple
ATP hydrolysis to transcriptional activation. Furthermore, usage of NifAΔC cofactors
from different species (together with their diversification by PCR shuffling) allows
identification of the sequence regions critical for transcriptional activation and a
"maturation" of the NifAΔC coactivator. Indeed, we have found the NifA from K.
pneumoniae to be a superior cofactor to A. vinelandii NifA. Finally, it may be possible to
use chimeras of a known DNA binding domain (e.g. GCN4) and a cDNA library as a
prokaryotic "enhancer" trap, to isolate σ54 activators on a genome-wide scale.

Configuration of the invention to detect protein-protein interactions follows the general
scheme of the yeast two-hybrid assay, and the reagents used in the invention may be set
up accordingly. In general, therefore, the invention will comprise a nucleic acid binding
domain-bait fusion, and a prey-σ54 activator domain fusion. Although, in general, "bait"
refers to a known polypeptide and "prey" to an unknown polypeptide, the terms may be
used interchangeably. Indeed, the invention comprises configurations in which both bait
and prey are known, or both are unknown.

Binding between the bait and the prey results in constitution of a hybrid protein which
comprises both a nucleic acid binding domain and a σ54 activator domain. The hybrid
protein is able to activate transcription from a reporter gene, thus providing a bait:prey
binding-dependent signal.

Protein-protein interactions may be selected using the preferred NifA system, in which the
hybrid σ54 transcriptional activator includes the NifA activation domain. The NifA
bacterial two-hybrid system may be used for the generation of interaction matrices between
cDNA libraries. Ultimately such interaction matrices may yield an interaction map of the
proteins of an organism. The invention provides an alternative to the yeast two-hybrid
system.

Systems based on σ54 have a number of advantages over the other systems that are
available, e.g. the conceptually similar yeast one- and two-hybrid systems. A bacterial host
allows substantially larger repertoires to be obtained and thus a much larger molecular
diversity to be screened. In particular, using combinatorial infection, the system of the
invention allows the "crossing" of σ54-chimera repertoires with libraries of hybrid
reporter constructs, thus permitting coevolution of DNA binding domains and recognition
sites, or coselection of DNA binding domains and target sites from genomic libraries.

Because selection in the σ54-based system is based on a positive readout, i.e. activation of
transcription, it is less prone to false positives than other approaches relying on the
inhibitory effect of the expressed DNA binding domains, like the transcription interference
assay (Elledge S.J. et al (1989) Proc. Nat. Acad. Sci USA 86, 3689). In vivo selection in
general may result in the selection of novel DNA binding domains that are more attuned to
working under realistic conditions, including supercoiling of the recognition site, presence
of a large excess of chromosomal DNA and high protein concentration. Another
advantage of the system of the invention is that extremely low levels of the hybrid protein
appear to be sufficient to effect maximum activation of transcription. This is particularly
helpful in the case of DNA binding domains that are prone to aggregation.

The σ54-based systems of the present invention may be further adapted to take into
account potential disadvantages of bacterial expression. For instance, E. coli expression
may be suboptimal for large eukaryotic transcription factors. However, large eukaryotic
proteins can often be split into smaller domains which retain function and are usually
readily expressed in E. coli.

According to the third configuration, a constitutively active σ54 activator may be used to
screen a library of candidate DNA-bending polypeptides, preferably in an IHF-negative host.
Since the degree of activation by the σ54 activator may be dependent on DNA bending by
additional factors, the levels of expression of the reporter gene will be modulated by the
DNA-bending activity of the candidate DNA-bending polypeptides.

The invention is further described, for the purposes of illustration only, in the following
examples.

Examples 



NifA from A. vinelandii is a well-studied member of the family of bacterial enhancers and
it is a positive regulator of the expression of nitrogenase components in diazotrophs. It is
inhibited by NifL in response to the presence of oxygen or ammonia. When expressed in E.
coli, which lacks endogenous NifL or an equivalent, NifA is constitutively active. Because
of the highly conserved nature of the activation mechanism of σ54 RNA polymerase, NifA
is a very strong activator of transcription in E. coli.

Like other members of the family of bacterial enhancer proteins, NifA is modular in
architecture, both structurally and functionally, comprising three domains: an N-terminal
sensor domain, a central activation domain (AD), and a C-terminal DNA binding domain (DBD).
The central activation domain (AD) can activate transcription independently of DNA
binding if overexpressed. Thus the DBD's function appears to be primarily to increase the
activation domain's concentration in the proximity of the promoter.

We have exploited the modularity of the enhancer structure and swapped the natural NifA
DNA binding domain (DBD) for heterologous DBDs and libraries thereof. Here we
describe the activity of these NifA-chimeras in the activation of transcription from the
σ54-dependent promoter nifH and hybrids thereof.

Materials & Methods 



Media & Reagents 

2xTY and MacConkey agar are described elsewhere (Miller J.H. (1972) Experiments in
molecular genetics, Cold Spring Harbor, NY). Antibiotics were used at the following
concentrations: ampicillin 0.1 mg/ml, chloramphenicol 10 µg/ml, streptomycin 25 µg/ml.

Min-lac medium was essentially M9 medium (Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2d ed. (1989) Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, N.Y.) supplemented with 1 mM MgSO4, 20 µM CaCl2, 2% (w/v) lactose, 2 mg/ml
casamino acids, 40 µg/ml L-tryptophan, 5 µg/ml thiamine and appropriate antibiotics.
Min-lacX plates were essentially M9 plates supplemented with 2% lactose, appropriate
antibiotics and 40 µg/ml X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside).

Strains 

TG1AK was derived from TG1 (Gibson T.J. (1984) Studies on the Epstein-Barr virus
genome, University of Cambridge) using the genome integration strategy of Haldimann A.
et al. (1996) Proc. Nat. Acad. Sci USA 93, 14361. Briefly, NifA (K. pneumoniae)
residues 1-462 were amplified using Pfu polymerase (Stratagene) and primers 1 (5'- GAG
TCA CTA ACG CAT ATG ATC CAT AAA TCC GAT TCG GAC -3') and 2 (5'- CGC GGA
TCC AAG CGG CCG CTC ATT AGC GAT GGT TGA ACA GAA TCA C -3'), cut with
NdeI and BamHI, cloned into the genome targeting suicide vector pSK50D-uidA2
(Haldimann, Op. Cit.) and transformed into the Pir+ host strain BW23473 (Metcalf W.W.
et al (1994) Plasmid 35, 1). Vectors were isolated and transformed into the Pir- strain TG1
harbouring the plasmid pINT-ts (Hasan N. et al (1994) Gene 150, 51). Chromosomal
integration was induced by a temperature shift to 42°C, which leads to expression of λ
integrase from pINT-ts and simultaneously stops its replication. Integrants were identified
by kanamycin resistance and screened for Nif coactivation. Once obtained, TG1AK was
grown routinely without antibiotic selection.
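
As a simple check on primer design of the kind described above, the restriction sites used for cloning can be located within the primer sequences. The sketch below is in Python and is illustrative only; the helper name find_sites is hypothetical, while the recognition sequences for NdeI (CATATG) and BamHI (GGATCC) are the standard ones and the two primers are primers 1 and 2 as given in the text.

```python
# Standard recognition sequences for the two enzymes named in the text.
RECOGNITION_SITES = {
    "NdeI": "CATATG",
    "BamHI": "GGATCC",
}

# Primers 1 and 2 from the Strains section, with spacing removed.
PRIMER_1 = "GAG TCA CTA ACG CAT ATG ATC CAT AAA TCC GAT TCG GAC".replace(" ", "")
PRIMER_2 = "CGC GGA TCC AAG CGG CCG CTC ATT AGC GAT GGT TGA ACA GAA TCA C".replace(" ", "")


def find_sites(seq, enzymes=RECOGNITION_SITES):
    """Map each enzyme to the 0-based positions of its site in seq.

    Enzymes with no site in the sequence are omitted from the result.
    """
    seq = seq.upper()
    found = {}
    for name, site in enzymes.items():
        positions = [i for i in range(len(seq) - len(site) + 1)
                     if seq[i:i + len(site)] == site]
        if positions:
            found[name] = positions
    return found


if __name__ == "__main__":
    print(find_sites(PRIMER_1))
    print(find_sites(PRIMER_2))
```

The same helper may be applied to an amplified insert to confirm that it carries no internal site for the enzymes used in the digest.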



Constructs 



Chimeric constructs were based on pDB737 (Austin S. et al (1994) J. Bacteriol. 176, 3460;
Buck M. et al (1986) Nature 320, 374) encoding NifA (A. vinelandii) under the control of
the T7 promoter in the plasmid pT7-7 (Tabor S. & Richardson C.C. (1985) Proc Natl
Acad Sci USA 82, 1074). Expression was by leakiness of the T7 promoter. Chimeras were
constructed taking advantage of a unique BanII cutting site in the linker region between
the central domain of NifA and the DBD. GCN4 was amplified using Pfu polymerase
(Stratagene) and primers 3 (5'- OCT GCC AGC GAG AGC CCG CCG CTC GCC GCG
ATT GTG CCC GAA TCC AGT GAT CCT -3') and 4 (5'- GAG CTA AAG CTT TTA
TTA GCG TTC GCC AAC TAA TTT CTT TAA TCT GGC -3'), cut with BanII and
HindIII and ligated into pDB737 cut with BanII and HindIII. ERDBD was amplified using
primers 5 (5'- GTC GAC AAC GAG AGC CCG CCG CTC GCC GCG GAA ACG CGT
TAC TGC GCT GTT TGC -3') and 6 (5'- GGT CAG CGC GTG GAT CCT TAA CCA
CCA CGA CGG TCT TTA CG -3'), cut with BanII and BamHI and ligated into pDB737
cut with BanII and BamHI. The vector p737S1 is derived from pDB737 by replacing the
bla gene with aadA, conferring streptomycin resistance, and by the insertion of an f1 phage
origin for packaging of the vector into filamentous phage particles. Briefly, aadA was
amplified using primers 7 (5'- TCA GCG CAC GCT GAC GTC GTG GAA ACG GAT
GAA GGC ACG AAC -3') and 8 (5'-CCO CCT GGA GGT GGC CAT TAT TTG CCG ACT
ACC TTG GTG ATC TCG CC -3'), cut with AatII and MscI and ligated with
pDB737 cut with AatII and ScaI. The resulting vector p737S was cut with AatII and ClaI. The
f1 ori was amplified using primers 9 (5'- GCT GCC GAC TCG ATC GAT GAA TGG
CGA ATG GCG CCT GAT GCG G -3') and 10 (5'-CCG GGT CGT GAC GTC AGT GTT
GGC GGG TGT CGG GGC TGG C -3'), cut with AatII and ClaI and cloned into the cut
p737S to give p737S1. NifA-X chimeras were transferred from pDB737 to p737S1 by
digestion with NdeI and HindIII (BamHI for NifA-ERDBD).

Reporter constructs were derived from pACYC184 and the vector pMB1 (Buck M. et al.
(1986) Nature 320, 374). Briefly, the lac operon (lacZYA) was amplified with primers 11
(5'- GAG TCA ATT CGG GGA TCC CGT CGT TTT ACA ACG TCG TGA CTG G -3') and
12 (5'- GAG TCA TTC TGG CCA GTC GAC CGC TCT GCC GGT GGT TAC -3') and
cut with BamHI and MscI. The nifH promoter segment from pMB1 was amplified with
primers 13 (5'- GAG TCA TTC AAG CTT GCG TGG AAT AAG ACA CAG GGG
GCG -3') and 14 (5'- GAG TCA TTC GGG ATC CCC GGA TTT ACC GAT ACC GCC
Tit ACC -3') and cut with HindIII and BamHI, and the two fragments were simultaneously
ligated with pACYC184 cut with HindIII and BsaAI to give pMB3. The f1 ori was amplified
with primers 15 (5'- GCT GCC GAC TCG GCT AGC GAA TGG CGA ATG GCG CCT GAT
GCG G -3') and 16 (GCC GGG TCG CTT TAA AGT GTT GGC GGG TGT CGG GGC
TGG C -3'), cut with NheI and DraI and ligated into pMB3 cut with NheI and XmnI
to give pMB31.

Selection and screening 

Cells were cotransformed, either by simultaneous or sequential electroporation, with an
expressor construct and a reporter construct, grown overnight with appropriate
antibiotic selection at 34°C in M9-lac medium and plated out. β-gal expression was scored
either on MacConkey or Min-lacX indicator plates or by ONPG enzyme assay of
selected colonies (see below).

Enzyme assay 

ONPG assays used to measure β-gal activity were essentially as described by Kolmar H.
et al. (1995) EMBO J. 14, 3895. Briefly, 20 µl of an overnight culture was transferred to a
microtitre well, 100 µl of chloroform-saturated Z-buffer (100 mM NaHPO4, 1 mM KCl,
1 mM MgSO4, 50 mM β-mercaptoethanol, pH 7.0 (Miller J.H. (1972) Experiments in
molecular genetics, Cold Spring Harbor, NY)) was added and the optical density at 600 nm
was determined using an ELISA reader. Cells were lysed by addition of 50 µl Z-buffer with
0.4% (w/v) SDS and incubated at 30°C for 10 min. 50 µl of Z-buffer with 4 mg/ml
o-nitrophenyl-β-D-galactopyranoside was added and the optical density at 420 nm was
recorded automatically every 15 s over a period of 60 min. Specific β-gal activity was
calculated from the Vmax as in Miller (Op. Cit.).
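
The calculation of specific activity from a kinetic read may be sketched as follows. This Python example is an assumption about how the Vmax step could be implemented and is not the procedure of Miller or Kolmar: it takes Vmax as the steepest least-squares slope of A420 against time over a sliding window, and normalises it by culture volume and OD600 in the manner of Miller units. The function names, the window size and the normalisation constant are all hypothetical choices.

```python
def max_slope(times_min, a420, window=4):
    """Largest least-squares slope (A420 per min) over any run of `window` reads."""
    if len(times_min) != len(a420):
        raise ValueError("times and absorbance readings must have equal length")
    if window < 2 or len(times_min) < window:
        raise ValueError("need at least `window` readings, with window >= 2")
    best = float("-inf")
    for i in range(len(times_min) - window + 1):
        t = times_min[i:i + window]
        y = a420[i:i + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        num = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
        den = sum((ti - t_mean) ** 2 for ti in t)
        best = max(best, num / den)
    return best


def specific_activity(times_min, a420, od600, culture_ml=0.02, window=4):
    """Miller-style units: 1000 * Vmax / (culture volume in ml * OD600).

    culture_ml defaults to the 20 µl culture volume given in the text.
    """
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return 1000.0 * max_slope(times_min, a420, window) / (culture_ml * od600)


if __name__ == "__main__":
    # Synthetic readings every 15 s (0.25 min) with a constant rate of 0.04 A420/min.
    times = [i * 0.25 for i in range(8)]
    readings = [0.1 + 0.04 * t for t in times]
    print(specific_activity(times, readings, od600=0.5))
```

Taking the maximum windowed slope rather than a single end-point reading avoids the lag at the start of the reaction and substrate depletion at the end; the window length would need to be tuned to the noise of the reader.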

Example 1: NifA chimera with heterologous DNA binding domains activate
transcription but only from promoters with a cognate recognition site 

To investigate in what way transcription activation by NifA was dependent on the NifA
DNA binding domain (DBD) and on native nif promoter structure, we prepared
NifA-chimeras in which the NifA DNA binding domain (DBD) had been replaced by
heterologous DBDs of diverse structural architectures. Initially we explored DBDs which,
like the NifA wild type (wt) DBD, bind to symmetrical DNA recognition sequences, such as
the basic leucine zipper (bZIP) DBD of the yeast transcription factor GCN4 and the
Zn-finger DNA binding domain of the human estrogen receptor (ERDBD), and determined
their capacity to activate transcription of a lacZ reporter gene in vivo from a hybrid nifH
promoter, in which the NifA UAS had been deleted and replaced by recognition sites for
the heterologous DBDs.

In order to simplify comparison of transcription activation by NifA chimeras with
activation by wt NifA, all reporter constructs had a single DNA recognition site. The wt
nifH promoter UAS contains three bona fide NifA recognition sites. Deletion of the two
sites more distal to the promoter, however, did not appear to reduce transcription
activation in our reporter under the conditions tested.

Transcription activation by NifA-chimeras was specific in that they only activated lacZ
expression from hybrid promoters bearing their cognate recognition sequences but not
from control reporter constructs bearing wild type UAS or a non-cognate site (Fig. 3). In
analogy to wt NifA, the presence of two or more recognition sites (in phase, see below) did
not increase activation by the NifA-GCN4 chimera.

Activity was also dependent on the phasing of the recognition site with respect to the
promoter: when the symmetric ATF/CREB recognition site for GCN4 was offset in
increments of 1 bp, optimal activity was observed when the ATF/CREB site was centred on
the same bp as the symmetric wt UAS. Presumably efficient contact with the RNA Pol
holoenzyme requires that the activator be bound on the right face of the DNA.
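
The phasing argument can be made concrete with a small calculation of how far a recognition site rotates around the helix when it is shifted along the DNA. The Python sketch below assumes a B-DNA helical repeat of 10.5 bp per turn, a figure that is not stated in the specification; the function names and the 45 degree tolerance are likewise arbitrary, illustrative choices.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of B-DNA; not given in the text


def face_offset_degrees(shift_bp, bp_per_turn=BP_PER_TURN):
    """Rotation (0 to 360 degrees) of a site displaced by shift_bp base pairs."""
    return (shift_bp * 360.0 / bp_per_turn) % 360.0


def same_face(shift_bp, tolerance_deg=45.0, bp_per_turn=BP_PER_TURN):
    """True if the displaced site lies within tolerance_deg of its original face."""
    angle = face_offset_degrees(shift_bp, bp_per_turn)
    return min(angle, 360.0 - angle) <= tolerance_deg


if __name__ == "__main__":
    for shift in range(0, 12):
        print(shift, round(face_offset_degrees(shift), 1), same_face(shift))
```

On this model a shift of about half a turn (5 bp) places the activator on the opposite face of the helix, while a shift of two full turns (21 bp) restores the original face, which is consistent with the observation that activity depends on the offset of the site.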

Transcription activation by NifA-chimeras appears to preserve the fine specificity of isolated
DBDs. Wild-type GCN4 binds with equal affinity to the symmetric ATF/CREB site as
well as to the pseudo-symmetric AP-1 site. Indeed, the NifA-GCN4 chimera showed
identical levels of transcription activation in reporter constructs with either of these sites
(Fig. 3). A NifA-ERDBD chimera showed strong activity on a reporter with its cognate
ERE site but no activity above background levels with reporters bearing the similar GRE
recognition site for the closely related glucocorticoid receptor DBD (Fig. 4).

Example 2: Coexpression of wild-type NifA with NifA-chimeras boosts transcription
activation by the chimeras in a specific and DNA-binding-independent manner

The level of transcription activation by the NifA-GCN4 and NifA-ERDBD chimeras was 
lower (ca. 10%) than for wt NifA. However, near wt levels of activity (up to 80%) were 
reached when wt NifA was coexpressed within the same cell as a "coactivator". 

The coactivation was independent of DNA binding, as a NifA variant in which the DBD
had been deleted (NifAΔC) was found to be just as active as wt NifA. On the other hand,
coexpression of an isolated NifA central domain (both the DBD and the N-terminal
sensor domain deleted (NifAΔNC)) failed to coactivate. NifA derived from different
species showed greatly variable efficiencies as coactivators. NifA variants from K.
pneumoniae (NifA Kp, NifAΔC Kp) were almost three times as effective as NifA, while
NifA variants from Rhizobium (NifA Rh1, NifA Rh2) were poorly active as coactivators
(Fig. 5).

The coactivator effect was found to enhance only specific transcriptional activation and not
background levels of transcription from promoters with non-cognate recognition sites. We
therefore constructed an E. coli strain expressing NifAΔC Kp (the K. pneumoniae NifA
with its DBD deleted) from a weak promoter (phoB) on the chromosome (TG1AK).

The coactivation effect has analogies in eukaryotic transcription, for example the enhancer
Sp1, in which isolated Sp1 activation domains can stimulate transcriptional activation by
the DNA-binding form of Sp1, a phenomenon termed "superactivation".

Example 3: Tethering of NifA chimera at the UAS is sufficient for activation, but strong 
activation requires correct positioning 

We also investigated transcription activation by NifA-chimeras whose DBDs bind
asymmetrical recognition sites, such as the classic Zn-finger Zif268 and the DBD from p53.

Both NifA-Zif268 and NifA-p53 chimeras activated transcription, but only at low levels (2-
to 5-fold above the background). However, when the Zif recognition site was duplicated to
give a symmetric palindromic site, transcription activation increased substantially.
Non-palindromic duplication of the recognition site in tandem did not increase activation.
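
The distinction between palindromic and tandem duplication of an asymmetric site can be illustrated in code. The Python sketch below uses the commonly cited Zif268 consensus site 5'-GCGTGGGCG-3' purely as an example; the exact site, orientation and spacing used in the reporter constructs are not given in the text, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_repeat(site):
    """Site followed by its reverse complement: a palindromic duplication."""
    return site.upper() + reverse_complement(site)


def tandem_repeat(site):
    """Two head-to-tail copies of the site: a non-palindromic duplication."""
    return site.upper() * 2


def is_palindromic(seq):
    """True if the sequence reads the same on both strands (5' to 3')."""
    seq = seq.upper()
    return seq == reverse_complement(seq)


if __name__ == "__main__":
    zif_site = "GCGTGGGCG"  # illustrative consensus, not taken from the text
    print(inverted_repeat(zif_site), is_palindromic(inverted_repeat(zif_site)))
    print(tandem_repeat(zif_site), is_palindromic(tandem_repeat(zif_site)))
```

Only the inverted arrangement presents the same half-site to each subunit of a symmetric dimer, which is the property the text associates with strong activation.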

Thus while simple tethering is sufficient for some activation, only bipartite binding
appears to give a strong activation. Presumably, tethering only leads to an approximate
positioning of the activation domain with respect to the RNA polymerase holoenzyme,
thereby reducing the likelihood of a productive interaction.

Example 4: Selection of active NifA-chimeras by lac complementation 

Using expression of the lac operon (lacZYA) from our reporter construct as the read-out
of transcription activation allows the selection of active NifA-chimeras on the basis of
metabolic complementation of a Δlac strain, with lactose as the only carbon source.
Initially we spiked populations of NifA-ERDBD with NifA-GCN4 at the ratios 1/10^4,
1/10^6 in the presence of the GCN4 cognate reporter ATF/CREB-nifH and grew
populations overnight in minimal medium supplied with lactose. Pre- and post-selection
populations were scored by plating on MacConkey-lactose plates as well as by PCR
screening. The results are summarised in Table 1. Selection factors of up to 10,000-fold
per round were observed.

Table 1: Selection factors for Nif selection by lac complementation

NifGCN4/NifERDBD    Selection factor
1/10^4              40-fold
1/10^5              40-fold
1/10^6              200-fold
1/10^7              4000-fold
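
To give a feel for what the selection factors in Table 1 imply, the number of selection rounds needed before the active clone dominates a population can be estimated. The Python sketch below assumes, for illustration only, that the selection factor multiplies the odds of the active clone and stays constant from round to round; neither assumption is made in the text, and the function name rounds_needed is hypothetical.

```python
def rounds_needed(start_fraction, factor, target_fraction=0.5):
    """Rounds of selection until the active clone reaches target_fraction.

    Assumes each round multiplies the odds (p / (1 - p)) of the active
    clone by a constant `factor`.
    """
    if not 0.0 < start_fraction < 1.0 or not 0.0 < target_fraction < 1.0:
        raise ValueError("fractions must lie strictly between 0 and 1")
    if factor <= 1.0:
        raise ValueError("selection factor must exceed 1")
    odds = start_fraction / (1.0 - start_fraction)
    target_odds = target_fraction / (1.0 - target_fraction)
    rounds = 0
    while odds < target_odds:
        odds *= factor
        rounds += 1
    return rounds


if __name__ == "__main__":
    # Starting ratios and selection factors as listed in Table 1.
    table_1 = [(1e-4, 40), (1e-5, 40), (1e-6, 200), (1e-7, 4000)]
    for start, factor in table_1:
        print(start, factor, rounds_needed(start, factor))
```

In practice the selection factor varies between rounds, so such an estimate is best read as a rough guide to the scale of the experiment.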



Example 5: Selection of active NifA-chimeras by flow cytometry. 

Expression of β-galactosidase (lacZ) as the read-out of transcription activation allows the
selection of active NifA-chimeras on the basis of metabolic complementation of a Δlac
strain, grown on lactose as the only carbon source. However, metabolic selection
predisposes the system to the generation of false positives. Presumably, the prolonged
growth under metabolic selection selects for mutant promoters, active in the absence of a
cognate enhancer.

We have observed that this only occurs for library sizes exceeding 10^6. Indeed, others
have found (using a related bacterial two-hybrid system) that it is not possible to retrieve
positive clones from dilutions higher than 1/10^6 by metabolic lac selection (G. Karimova
et al. (1998) Proc Natl Acad Sci USA 95, 12532-7). As it is well known that bacteria can
develop a mutator phenotype under adaptive stress (P. D. Sniegowski, P. J. Gerrish, R. E.
Lenski (1997) Nature 387, 703-5), we conclude that it is preferable to separate the
selection from the amplification (growth) step in order to reduce the likelihood of
revertants.

We thus replaced lacZ with the Aequorea victoria green fluorescent protein (EGFP: F64L,
S65T; ex 488 nm, em 527 nm) as the reporter gene. The Clontech variant pGFPmut3.1 (S65E,
S72A; ex 501 nm, em 511 nm), a FACS-optimised variant, was also tried but found inferior.
GFP has the advantage that cells can be grown first and then separated on the basis of
fluorescence using fluorescence activated cell sorting (FACS).

We prepared a trial library of mutant GCN4 bZIP DBDs (library size 10^8) in which 5 key
residues of GCN4 that interact with DNA (Asn235, Ala238, Ala239, Ser242, Arg243) were
randomised, and selected it against a GFP hybrid reporter with the cognate ATF/CREB
site. Library populations were grown overnight at 34°C in non-fluorescent medium (NFM;
minimal medium supplemented with 2% glucose, 0.2% casamino acids, 12 ng/ml L-Trp). For
FACS (Cytomation MoFlo, 488 nm laser, FL-1 530/40 filter) a 1 ml aliquot was diluted
10X in NFM and the top 1% fluorescent cell population was sorted into a 96-well plate at
1 cell per well, and grown up overnight at 34°C. Cell fluorescence of the grown-up clones
was measured using a SPECTRAmax® GEMINI Dual-Scanning Microplate
Spectrofluorometer (Molecular Devices; ex 480, em 520, cut-off 515 nm). Plasmids from
fluorescent wells were subsequently sequenced. Pre- and post-selection populations were
also scored by PCR screening as well as by plating on min glu (M9 minimal medium +
glucose) plates and visualising colonies with a fluorescence microscope.
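
By way of illustration only, the gating step described above (sorting the top 1% fluorescent
cell population) can be sketched as a threshold computed from per-event fluorescence values.
The function name, the simulated intensities and the log-normal distribution used below are
assumptions made for the example; they form no part of the instrument software or of the
method as performed.

```python
import random


def top_fraction_threshold(intensities, fraction=0.01):
    """Return the lowest intensity that still falls within the brightest `fraction` of events."""
    ranked = sorted(intensities, reverse=True)
    # Keep at least one event, even for very small populations.
    n_keep = max(1, int(len(ranked) * fraction))
    return ranked[n_keep - 1]


# Simulated FL-1 intensities (hypothetical data, arbitrary units).
random.seed(0)
events = [random.lognormvariate(3.0, 0.5) for _ in range(10_000)]

gate = top_fraction_threshold(events, fraction=0.01)
sorted_events = [x for x in events if x >= gate]
print(len(sorted_events), "of", len(events), "events pass the gate")
```

In practice the gate would be drawn on the sorter itself; the sketch only shows how a
"top 1%" cut-off relates to the measured distribution.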

10^5 cells were sorted in total, of which 219 were in the top 1% fluorescent
population; 132 of these were captured in the 96-well plates, and 13 of the captured cells
were fluorescent. Selected positives were checked by separating the mutant GCN4-bZIP DBD
expressor plasmids and re-transforming them together with cognate and non-cognate
reporter plasmids. None of the selected positives gave a fluorescent signal when
combined with non-cognate reporter plasmids, but all were fluorescent when combined with
the ATF/CREB cognate reporter plasmid (which did not produce any fluorescence when
transformed on its own).

This indicates that GFP selection indeed avoids the isolation of false positives.
Furthermore, when the library was checked prior to FACS sorting, no fluorescent clones
were identified when plating >10^7 cells. 1/10 clones plated post selection were
fluorescent, suggesting a selection factor in a single round in excess of 10^6-fold.
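
By way of illustration only, the selection factor stated above can be reproduced as a lower
bound from the two observations: fewer than 1 fluorescent clone per 10^7 cells plated before
sorting, and 1 fluorescent clone per 10 plated after sorting. The function and parameter
names below are hypothetical; exact rational arithmetic is used merely to avoid rounding.

```python
from fractions import Fraction


def enrichment_lower_bound(pre_hits_upper, pre_total, post_hits, post_total):
    """Lower bound on fold enrichment, given an upper bound on pre-selection positives."""
    pre_freq_upper = Fraction(pre_hits_upper, pre_total)  # at most this frequency before sorting
    post_freq = Fraction(post_hits, post_total)           # observed frequency after sorting
    return post_freq / pre_freq_upper


# Fewer than 1 positive in 10^7 before sorting; 1 in 10 after sorting.
factor = enrichment_lower_bound(1, 10**7, 1, 10)
print(factor)  # 1000000
```

Because no fluorescent clone was seen before sorting, the true pre-selection frequency is
below 1/10^7 and the enrichment is correspondingly above this figure.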

All publications mentioned in the above specification are herein incorporated by 
reference. All database sequences denoted by accession or gi numbers are likewise 
incorporated by reference. 

Various modifications and variations of the described methods and system of the 
invention will be apparent to those skilled in the art without departing from the scope and 
spirit of the invention. Although the invention has been described in connection with 
specific preferred embodiments, it should be understood that the invention as claimed 
should not be unduly limited to such specific embodiments. Indeed, various modifications 
of the described modes for carrying out the invention which are obvious to those skilled in 
molecular biology or related fields are intended to be within the scope of the following 
claims. 
Claims

1. A method for detecting a protein-nucleic acid interaction between a nucleic acid molecule
and a protein molecule, comprising the steps of:

a) providing one or more hybrid σ54 activator proteins comprising a heterologous
nucleic acid binding sequence and a constitutively active σ54 transcription activating
domain;

b) providing one or more nucleic acid molecules comprising a binding site for the
nucleic acid binding sequence and a binding site for σ54 RNAP, which directs the
expression of a reporter gene and leads to upregulation thereof in response to activation by
the σ54 transcription activating domain; and

c) detecting expression of the reporter gene.

2. A method according to claim 1, comprising providing a repertoire of hybrid σ54
activator proteins, said repertoire comprising a plurality of different nucleic acid binding
sequences.

3. A method according to claim 1, comprising providing a repertoire of hybrid nucleic
acid molecules, said repertoire comprising a plurality of different binding sites for the
nucleic acid binding sequence.

4. A method according to claim 1, comprising providing both a repertoire according
to claim 2 and a repertoire according to claim 3.

5. A method for detecting a protein-protein interaction, comprising the steps of:

a) providing a first hybrid protein comprising a nucleic acid binding sequence and a
first polypeptide sequence bait;

b) providing a second hybrid protein comprising a prey polypeptide sequence and a
constitutively active σ54 transcription activating domain;

c) providing a nucleic acid molecule comprising a binding site for the nucleic acid
binding sequence and a binding site for σ54 RNAP which directs the expression of a
reporter gene and leads to upregulation thereof in response to activation by the σ54
transcription activating domain;

d) incubating the first and second hybrid proteins together with the nucleic acid
molecule such that the prey and bait polypeptide sequences may bind, thereby forming a
hybrid protein comprising both a nucleic acid binding sequence and a σ54 transcription
activating domain; and

e) detecting expression of the reporter gene.

6. A method according to claim 5, comprising providing a repertoire of first hybrid
proteins, said repertoire comprising a plurality of bait polypeptides.

7. A method according to claim 5, comprising providing a repertoire of second hybrid
proteins, said repertoire comprising a plurality of prey polypeptides.

8. A method according to claim 5, comprising providing a repertoire of first hybrid
proteins and a repertoire of second hybrid proteins, said repertoires comprising a plurality
of bait and prey polypeptides.

9. A method for screening a repertoire of candidate DNA-bending
polypeptides, comprising the steps of:

a) providing a repertoire of candidate polypeptide factors with potential to induce
bending of DNA;

b) providing a σ54 activator protein comprising a nucleic acid binding sequence
and a σ54 transcription activating domain;

c) providing a nucleic acid molecule comprising a binding site for the nucleic acid
binding sequence and a binding site for σ54 RNAP which directs the expression of a
reporter gene and leads to upregulation thereof in response to activation by the σ54
transcription activating domain;

d) incubating the repertoire and σ54 activator together with the nucleic acid
molecule in a HIF* host cell, such that the σ54 activator and the nucleic acid molecule may
interact, and transcription is activated from the σ54 RNAP binding site in a manner
dependent on DNA bending by the polypeptide factors; and

e) detecting expression of the reporter gene.

10. A method according to any preceding claim, wherein the polypeptides are obtained
by expression within a bacterial host cell.

11. A method according to claim 10, wherein the polypeptides are encoded by one or
more libraries of nucleic acid vectors.

12. A method according to claim 11, wherein a first library of nucleic acid vectors
encodes a first chimeric gene, said gene comprising a nucleic acid sequence that encodes a
nucleic acid-binding domain and a nucleic acid sequence encoding a first (bait) test protein or
protein fragment in such a manner that the first test protein is expressed as part of a hybrid
protein with the nucleic acid-binding domain.

13. A method according to claim 11, wherein a second library of nucleic acid vectors
encodes a second chimeric gene, said gene comprising a nucleic acid sequence that
encodes a σ54 transcriptional activation domain and a second (prey) test protein or protein
fragment into the vector, in such a manner that the second test protein is capable of being
expressed as part of a hybrid protein with the transcriptional activation domain.

14. A method according to any preceding claim, wherein the σ54 transcriptional
activator is selected from the group consisting of:

dbj|BAA16379.1| (D90877) FORMATE HYDROGENLYASE TRANSCRIPTIONAL ACTIVATOR;
emb|CAA26472.1| (X02616) pot. Nifa gene product (aa 1-484) [Klebsiella pneumoniae];
emb|CAA53584.1| (X75972) anfa [Rhodobacter capsulatus];
emb|CAA92413.1| (Z68203) nifa homologue [Rhizobium sp.];
emb|CAA93242.1| (Z69251) mopr [Acinetobacter calcoaceticus];
emb|CAB53157.1| (X07567) nifal [Rhodobacter capsulatus];
emb|CAB56537.1| (AJ249642) response regulator [Pseudomonas stutzeri];
gb|AAA58220.1| (U18997) ORF_o532 [Escherichia coli];
gb|AAA99303.1| (L43064) regulatory protein [Pseudomonas aeruginosa];
gb|AAB91397.1| (AF033203) nifaii protein [Rhodobacter capsulatus];
gb|AAC05586.1| (AF006075) regulatory protein [Bacillus subtilis];
gb|AAC37124.1| (L81176) fleq [Pseudomonas aeruginosa];
gb|AAC45640.1| (AF010585) putative sigma 54 activator [Caulobacter crescentus];
gb|AAC46367.1| (AF014113) two-component response regulator [Vibrio cholerae];
gb|AAD34591.1|AF145956_1 (AF145956) transcriptional activator nifa [Rhodospirillum rubrum];
gb|AAD38416.1| (AF155934) nifa [Alcaligenes faecalis];
gb|AAF28395.1| (AF069392) flam [Vibrio parahaemolyticus];
gb|AAF33506.1| (AF170176) Salmonella typhimurium transcriptional regulatory protein;
gb|AAF61932.1| (AF230804) sigma-54 activator protein Actl [Myxococcus xanthus];
gb|AAF85342.1|AE004061_7 (AE004061) two-component system, regulatory protein [Xylella fastidiosa];
gb|AAF94676.1| (AE004230) sigma-54 dependent response regulator [Vibrio cholerae];
gb|AAF95280.1| (AE004286) sigma-54 dependent response regulator [Vibrio cholerae];
gb|AAF96095.1| (AE004358) sigma-54 dependent transcriptional regulator [Vibrio cholerae];
gb|AAG01527.1|AF288483_1 (AF288483) nifa [Azospirillum brasilense];
pir||A48291 ornithine decarboxylase inhibitor - Escherichia coli;
pir||B49940 nitrogen regulator I homolog - Escherichia coli;
pir||C70320 transcription regulator nifa family - Aquifex aeolicus;
pir||C70396 transcription regulator ntrc family - Aquifex aeolicus;
pir||C70454 transcription regulator ntrc family - Aquifex aeolicus;
pir||D70315 transcription regulator ntrc family - Aquifex aeolicus;
pir||H69581 transcription activator of acetoin dehydrogenase operon acor - Bacillus subtilis;
pir||I39719 nitrogen regulatory protein - Agrobacterium tumefaciens;
pir||JC5471 regulatory protein nifa - Azospirillum lipoferum;
pir||T08624 probable ntrc-type response regulator - Eubacterium acidaminophilum;
sp|P03027|NIFA_KLEPN NIF-SPECIFIC REGULATORY PROTEIN;
sp|P09570|NIFA_AZOVI NIF-SPECIFIC REGULATORY PROTEIN;
sp|P12627|VNFA_AZOVI NITROGEN FIXATION PROTEIN VNFA;
sp|P14375|HYDG_ECOLI TRANSCRIPTIONAL REGULATORY PROTEIN HYDG;
sp|P21712|YFHA_ECOLI HYPOTHETICAL 49.1 KD PROTEIN IN GLNB-PURL INTERGENIC REGION (ORFXB);
sp|P24426|NIFA_RHILT NIF-SPECIFIC REGULATORY PROTEIN;
sp|P25852|HYDG_SALTY TRANSCRIPTIONAL REGULATORY PROTEIN HYDG;
sp|P27713|NIFA_HERSE NIF-SPECIFIC REGULATORY PROTEIN;
sp|P30667|NIFA_AZOBR NIF-SPECIFIC REGULATORY PROTEIN;
sp|P38035|RTCR_ECOLI TRANSCRIPTIONAL REGULATORY PROTEIN RTCR;
sp|P54929|NIFA_AZOLI NIF-SPECIFIC REGULATORY PROTEIN;
sp|P56266|NIFA_KLEOX NIF-SPECIFIC REGULATORY PROTEIN;
sp|Q06065|ATOC_ECOLI ACETOACETATE METABOLISM REGULATORY PROTEIN ATOC (ORNITHINE/ARGININE;
sp|Q46802|YGEV_ECOLI HYPOTHETICAL SIGMA-54-DEPENDENT TRANSCRIPTIONAL REGULATOR IN;
sp|Q53206|NIFA_RHISN NIF-SPECIFIC REGULATORY PROTEIN; and
sp|Q9ZIB7|TYRR_ERWHE TRANSCRIPTIONAL REGULATORY PROTEIN TYRR.

15. A method according to any one of claims 1 to 14, wherein the σ54 transcriptional
activator is the NifA transcriptional activator or the PspF transcriptional activator.

16. A method according to any one of claims 1 to 14, wherein the hybrid σ54
transcriptional activator is NifA and activation resulting from NifA-σ54 RNAP interaction
is enhanced by the coexpression of wild-type or mutant NifA.

17. A method according to claim 16, wherein the hybrid σ54 transcriptional activator is
NifA from Azotobacter vinelandii, and the wild-type or mutant NifA is NifA from
Klebsiella pneumoniae.

18. A method according to any preceding claim, wherein the nucleic acid molecule
comprises a binding site for a factor which induces DNA bending.

19. A method according to claim 18, wherein the factor is integration host factor (IHF).

20. A method according to any one of claims 1 to 17, wherein the nucleic acid
molecule comprises DNA that is intrinsically bent.

21. A method according to any preceding claim, wherein the nucleic acid molecule
comprises a nifH promoter from A. vinelandii driving a reporter gene.

22. A method according to any preceding claim, wherein the reporter gene is selected
from the group consisting of metabolic markers such as the lac operon (lacZ, lacY and
lacA); proteins conferring a fluorescent phenotype, such as GFP; proteins conferring
antibiotic resistance, such as Zeo; and proteins conferring another selectable property.

23. A method according to any preceding claim, which is carried out in the presence of
a compound which modifies protein-protein or protein-DNA interaction.

24. A method according to claim 22, wherein the compound is selected from the group
consisting of molecules which alter the structure of the DNA-binding protein; molecules
which alter the structure of DNA; and molecules which modify protein-protein
interactions.

25. A method according to any preceding claim, which is carried out in vivo.

26. A method according to claim 25, wherein the in vivo host is E. coli.

27. A method according to any one of claims 1 to 24, which is carried out in vitro.

Figure 3
□ NifGCN4/AP-1 + NifADC
■ NifGCN4/UAS + NifADC

WO 01/18244 PCT/GB00/03450

Figure 4

Figure 5